Daghestanian loans database
Authors: Ilya Chechuro, Michael Daniel, and Samira Verhees.
This database contains wordlists collected as part of the Daghestanian loans project by the Linguistic Convergence Laboratory at NRU HSE. The aim of the 160-item shortlist, which is based on the World Loanword Database questionnaire, is to measure lexical contact at the micro level: to quantify lexical convergence among the speech communities of minority languages at the village level, and to detect fine-grained areal patterns beyond general observations about the spheres of influence of certain languages.
Contents:
target_words: 25796
languages: 23
How to cite this project
If you use data from the database in your research, please cite as follows:
Chechuro I., Daniel M., Dobrushina N., and Verhees S. 2019. Daghestanian loans database. Linguistic Convergence Laboratory, HSE. (Available online at https://lingconlab.github.io/Dagloan_database/DL_database.html, accessed on April 29, 2019.)
The database
For now, the table shows source Concepts and target Words. Each target word is assigned to a similarity Set: a set of words that have the same meaning and look similar. Data on borrowing sources will be added in the future. Metadata includes the name of the Village where the word was recorded, the administrative District it is part of, the Language spoken there, and the List ID. Each List ID corresponds to a particular speaker or, in some cases, a written source such as a dictionary. Data is accessible at: Github/LingConLab/DagloanDatabase.
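As an illustration of the table layout described above, the following Python sketch models one row as a record and groups list IDs by similarity set. It is not the project's own code: the field names, the `lists_by_set` helper, and all row values are hypothetical placeholders, not data from the database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One row of the table; field names are assumptions based on the description."""
    concept: str   # source Concept
    word: str      # target Word
    set_id: int    # similarity Set the word belongs to
    village: str
    district: str
    language: str
    list_id: str   # a speaker or a written source


# Placeholder rows, invented for illustration only.
entries = [
    Entry("concept_1", "word_a", 1, "Village A", "District X", "Language P", "VillageA1"),
    Entry("concept_1", "word_a2", 1, "Village B", "District X", "Language Q", "VillageB1"),
    Entry("concept_1", "word_b", 2, "Village B", "District X", "Language Q", "VillageB2"),
]


def lists_by_set(rows):
    """Map each (concept, similarity set) pair to the list IDs that attest it."""
    groups = defaultdict(set)
    for row in rows:
        groups[(row.concept, row.set_id)].add(row.list_id)
    return dict(groups)


groups = lists_by_set(entries)
for key, list_ids in sorted(groups.items()):
    print(key, sorted(list_ids))
```

Grouping by the pair of concept and set keeps sets for different concepts apart even if they happen to share a set number.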
The dataset in the dummy format is available here.
Version: 2019-04-29. For questions or comments contact jh.verhees@gmail.com.
Map of the surveyed villages
Hover over or click on a dot on the map to learn more. The color of a dot corresponds to the number of lists collected in that village. Orange = dictionary data.
Sample lexical map
The map below shows the distribution of different stems for the concept ‘pepper’.
Sources of lexical influence
Cluster Dendrogram of Foreign Influence
This tree is built as follows. The distance between two cells is 0 only if both cells are non-empty and match; otherwise it is 1. Cells with NA values are not counted.
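The rule above can be sketched in Python as follows. This is one possible reading, not the project's own code: it assumes that each speaker is a row of cells, that an NA is represented as `None`, that an empty cell is an empty string, and that per-cell distances are summed (the page does not say whether they are summed or averaged). The names `cell_distance` and `distance_skip_na` and the cell labels are hypothetical.

```python
def cell_distance(a, b):
    """Return 0 only for two matching non-empty cells, otherwise 1."""
    return 0 if (a and b and a == b) else 1


def distance_skip_na(row_a, row_b):
    """Sum cell distances, leaving out every pair in which either cell is NA (None)."""
    total = 0
    for a, b in zip(row_a, row_b):
        if a is None or b is None:
            continue  # NA cells are not counted
        total += cell_distance(a, b)
    return total


# Placeholder rows; "X", "Y", "Z" stand for arbitrary cell values.
speaker_1 = ["X", "Y", None, ""]
speaker_2 = ["X", "Z", "Y", ""]

# Pairs: X/X -> 0, Y/Z -> 1, NA/Y -> skipped, empty/empty -> 1
d = distance_skip_na(speaker_1, speaker_2)
print(d)
```

Under this reading, two empty cells still contribute a distance of 1, because only matching non-empty cells count as identical.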
Table columns: Speaker, Language, Village, District, followed by one column per list ID:
Alibeglo1 Arkhit1 Arkhit2 Arkhit3
Arkhit4 Arkhit5 Arkhit6 Bezhta1 Darvag1 Darvag2 Darvag3 Darvag4
Darvag5 Darvag6 Dyubek1 Dyubek2 Dyubek3 Dyubek4 Dzhavgat1 Dzhavgat2
Dzhavgat3 Dzhavgat4 Dzhibakhni1 Dzhibakhni2 Dzhibakhni3 Dzhibakhni4
Helmets1 Helmets2 Helmets3 Ikhrek1 Ikhrek2 Ikhrek3 Ikhrek4 Ilisu1
Karata1 Karata2 Karata3 Karata4 Khapil1 Khapil2 Khapil3 Khapil4
Khapil5 Khiv1 Khiv2 Khiv3 Khiv4 Khlut1 Khlut2 Khlut3 Khlut4 Khlut5
Khoredzh1 Khoredzh2 Khoredzh3 Khoredzh4 Khoredzh5 Khoredzh6 Khutkhul1
Khutkhul2 Khutkhul3 Khutkhul4 Kiche1 Kiche2 Kidero1 Kidero2 Kidero3
Kina1 Kina2 Kina3 Kurag1 Kusur1 Laka1 Laka2 Laka3 Laka4 Laka5 Laka6
Meshabash1 Meshabash2 Mikik1 Mikik2 Qax1 Qax2 Qax3 Qax4 Qax5 Qax6
Qax7 Qax8 Qax9 Qum1 Qum2 Rikvani1 Rutul1 Tad-Magitl1 Tad-Magitl2
Tatil1 Tatil2 Tatil3 Tatil4 Tatil5 Tlibisho1 Tlibisho2 Tlibisho3
Tlibisho4 Tpig1 Tsinit1 Tsinit2 Tsinit3 Tsinit4 Tsinit5 Tukita1
Yagdyg1 Yagdyg2 Yagdyg3 Yagdyg4 Yagdyg5 Yagdyg6 Yersi1 Yersi2 Yersi3
Yersi4 Zilo1 Zilo2
(The 125 rows of this table were omitted from the printed output.)
Cluster Dendrogram of Foreign Influence (Strict Distances)
This tree is built as follows. The distance between two cells is 0 only if both cells are non-empty and match; otherwise it is 1. Here the NAs are counted, which leads to large distances even when speakers are similar.
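A minimal Python sketch of the strict variant, under stated assumptions: each speaker is a row of cells, an NA is represented as `None`, an empty cell is an empty string, and per-cell distances are summed. It is an illustration of the rule as described, not the project's own code; `strict_distance` and the cell labels are hypothetical.

```python
def strict_distance(row_a, row_b):
    """Sum cell distances; a pair scores 0 only if both cells are non-empty and equal.

    NA cells (None) are counted, so any pair that involves an NA scores 1.
    """
    return sum(
        0 if (a and b and a == b) else 1
        for a, b in zip(row_a, row_b)
    )


# Placeholder rows; "X", "Y", "Z" stand for arbitrary cell values.
speaker_1 = ["X", "Y", None, ""]
speaker_2 = ["X", "Z", "Y", ""]

# Pairs: X/X -> 0, Y/Z -> 1, NA/Y -> 1, empty/empty -> 1
d_strict = strict_distance(speaker_1, speaker_2)

# Two identical rows that consist mostly of NAs still end up far apart.
sparse_a = ["X", None, None, None]
sparse_b = ["X", None, None, None]
d_sparse = strict_distance(sparse_a, sparse_b)

print(d_strict, d_sparse)
```

The second pair shows the effect noted above: rows that agree wherever they have data are still assigned a large distance, because every NA pair counts as a mismatch.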
The table has the same columns as the one in the previous section: Speaker, Language, Village, District, and one column per list ID (Alibeglo1 through Zilo2). Its 125 rows were likewise omitted from the printed output.